Overview:
This page contains the results of CoNGA analyses.
Results in tables may have been filtered to reduce redundancy,
focus on the most important columns, and
limit length; full tables should exist as OUTFILE_PREFIX*.tsv files.
Command:
/Users/Nick/conga/scripts/run_conga.py --gex_data merged_COVID_gex.h5ad --gex_data_type h5ad --clones_file merged_COVID_clones.tsv --organism human --graph_vs_graph --outfile_prefix ./CoNGA.output --no_kpca
Stats
num_cells_w_gex: 11282
num_features_start: 36601
num_cells_w_tcr: 10155
min_genes_per_cell: 200
max_genes_per_cell: 2500
max_percent_mito: 0.1
num_filt_max_genes_per_cell: 3463
num_filt_max_percent_mito: 114
num_TR_genes: 151
num_TR_genes_in_hvg_set: 92
num_highly_variable_genes: 1396
num_cells_after_filtering: 6578
num_clonotypes: 5453
max_clonotype_size: 135
num_singleton_clonotypes: 4864
nbr_frac_for_nndists: 0.01
num_gvg_hit_clonotypes: 80
num_gvg_hit_biclusters: 7
graph_vs_graph
Graph vs graph analysis looks for correlation between GEX and TCR space
by finding statistically significant overlap between two similarity graphs,
one defined by GEX similarity and one by TCR sequence similarity.
Overlap is defined one node (clonotype) at a time by looking for overlap
between that node's neighbors in the GEX graph and its neighbors in the
TCR graph. The null model is that the two neighbor sets are chosen
independently at random.
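The null model above can be sketched as a hypergeometric upper-tail probability. This is a minimal illustration, not CoNGA's own code: the function name `overlap_pvalue` and the choice to draw neighbors from the other N - 1 clonotypes are assumptions made here, and the exact calculation inside CoNGA may differ (for example in how the overlap is corrected).

```python
from math import comb


def overlap_pvalue(overlap: int, num_clonotypes: int,
                   num_gex_nbrs: int, num_tcr_nbrs: int) -> float:
    """P(X >= overlap) if the TCR neighbor set is drawn at random,
    independently of the GEX neighbor set (hypergeometric upper tail)."""
    # Assumption: a clonotype is not its own neighbor, so both neighbor
    # sets are drawn from the remaining N - 1 clonotypes.
    population = num_clonotypes - 1
    total = comb(population, num_tcr_nbrs)
    max_overlap = min(num_gex_nbrs, num_tcr_nbrs)
    tail = sum(
        comb(num_gex_nbrs, k) * comb(population - num_gex_nbrs, num_tcr_nbrs - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# A larger shared neighborhood is less likely under the null model.
p_small = overlap_pvalue(2, 100, 10, 10)
p_large = overlap_pvalue(5, 100, 10, 10)
```

Exact integer arithmetic via `math.comb` keeps the sketch dependency-free; for large datasets a library routine such as a hypergeometric survival function would be the more practical choice.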
CoNGA looks at two kinds of graphs: K nearest neighbor (KNN) graphs, where
K = neighborhood size is specified as a fraction of the number of
clonotypes (defaults for K are 0.01 and 0.1), and cluster graphs, where
each clonotype is connected to all the other clonotypes in the same
(GEX or TCR) cluster. Overlaps are computed 3 ways (GEX KNN vs TCR KNN,
GEX KNN vs TCR cluster, and GEX cluster vs TCR KNN), for each of the
K values (called nbr_fracs short for neighbor fractions).
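As a worked example of how a neighbor fraction turns into a neighborhood size K, the sketch below multiplies each nbr_frac by the number of clonotypes reported in the Stats section (5453). The helper name `neighborhood_size` and the truncate-toward-zero rounding are assumptions; they were chosen because they agree with the num_neighbors values of 54 and 545 shown in the results table.

```python
def neighborhood_size(nbr_frac: float, num_clonotypes: int) -> int:
    # Assumption: truncate toward zero and keep at least one neighbor.
    return max(1, int(nbr_frac * num_clonotypes))


NUM_CLONOTYPES = 5453  # num_clonotypes from the Stats section

for nbr_frac in (0.01, 0.1):
    print(nbr_frac, neighborhood_size(nbr_frac, NUM_CLONOTYPES))
```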
Columns (depend slightly on whether hit is KNN v KNN or KNN v cluster):
conga_score = (P value for GEX/TCR overlap) * (number of clonotypes)
mait_fraction = fraction of the overlap made up of 'invariant' T cells
num_neighbors* = size of neighborhood (K)
cluster_size = size of cluster (for KNN v cluster graph overlaps)
clone_index = 0-index of clonotype in adata object
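The column definitions can be exercised with a small parsing sketch. It reads tab-separated text shaped like the full results table, turns NaN entries (columns that do not apply to a given graph_overlap_type) into `None`, and recovers the raw P value by dividing conga_score by the number of clonotypes. The three sample rows are copied from the table on this page and restricted to a few columns; the in-memory string, the `load_hits` helper, and the `raw_pvalue` key are illustrative assumptions, not part of CoNGA.

```python
import csv
import io
import math

NUM_CLONOTYPES = 5453  # num_clonotypes from the Stats section

# Three hits from this page, one per graph_overlap_type, subset of columns.
SAMPLE_TSV = (
    "conga_score\tnum_neighbors_gex\tnum_neighbors_tcr\tcluster_size\t"
    "graph_overlap_type\tclone_index\n"
    "0.001559\tNaN\t54.0\t205.0\tgex_cluster_vs_tcr_nbr\t3634\n"
    "0.003253\t54.0\t54.0\tNaN\tgex_nbr_vs_tcr_nbr\t3528\n"
    "0.007744\t545.0\tNaN\t708.0\tgex_nbr_vs_tcr_cluster\t4928\n"
)

NUMERIC_COLUMNS = ("conga_score", "num_neighbors_gex",
                   "num_neighbors_tcr", "cluster_size")


def load_hits(text: str) -> list:
    """Parse TSV text into dicts, mapping NaN to None."""
    hits = []
    for row in csv.DictReader(io.StringIO(text), delimiter="\t"):
        hit = {
            "graph_overlap_type": row["graph_overlap_type"],
            "clone_index": int(row["clone_index"]),  # 0-based index
        }
        for col in NUMERIC_COLUMNS:
            value = float(row[col])
            hit[col] = None if math.isnan(value) else value
        # conga_score = P value * number of clonotypes, so invert it.
        hit["raw_pvalue"] = hit["conga_score"] / NUM_CLONOTYPES
        hits.append(hit)
    return hits


hits = load_hits(SAMPLE_TSV)
```

In the sample, the cluster-based side of each comparison has no neighbor count, and the KNN v KNN hit has no cluster_size, which is the column dependence noted in the header above.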
| conga_score | num_neighbors_gex | num_neighbors_tcr | overlap | overlap_corrected | mait_fraction | clone_index | nbr_frac | graph_overlap_type | cluster_size | gex_cluster | tcr_cluster | va | ja | cdr3a | vb | jb | cdr3b |
| 0.001559 | NaN | 54.0 | 14 | 12 | 0.000000 | 3634 | 0.01 | gex_cluster_vs_tcr_nbr | 205.0 | 9 | 5 | TRAV35*01 | TRAJ53*01 | CAGRLSGGSNYKLTF | TRBV11-2*01 | TRBJ1-2*01 | CASSLTGNYGYTF |
| 0.003253 | 54.0 | 54.0 | 8 | 7 | 0.000000 | 3528 | 0.01 | gex_nbr_vs_tcr_nbr | NaN | 2 | 5 | TRAV35*01 | TRAJ42*01 | CAAVNYGGSQGNLIF | TRBV5-1*01 | TRBJ2-2*01 | CASSPRTGGSTGELFF |
| 0.007744 | 545.0 | NaN | 108 | 108 | 0.000000 | 4928 | 0.10 | gex_nbr_vs_tcr_cluster | 708.0 | 0 | 1 | TRAV8-4*01 | TRAJ54*01 | CAVSDRQGAQKLVF | TRBV4-1*01 | TRBJ2-1*01 | CASRSGWANEQFF |
| 0.015643 | 545.0 | 545.0 | 87 | 87 | 0.011494 | 4966 | 0.10 | gex_nbr_vs_tcr_nbr | NaN | 2 | 12 | TRAV8-6*01 | TRAJ13*01 | CAVITSGGYQKVTF | TRBV20-1*01 | TRBJ1-6*01 | CSARDRTESSYNSPLHF |
| 0.023260 | NaN | 54.0 | 13 | 12 | 0.000000 | 3556 | 0.01 | gex_cluster_vs_tcr_nbr | 257.0 | 6 | 5 | TRAV35*01 | TRAJ42*01 | CAGMNYGGSQGNLIF | TRBV11-2*01 | TRBJ1-2*01 | CASSQREGTLYGYTF |
| 0.029243 | 545.0 | 545.0 | 86 | 86 | 0.011628 | 2403 | 0.10 | gex_nbr_vs_tcr_nbr | NaN | 5 | 2 | TRAV24*01 | TRAJ44*01 | CAPGTASKLTF | TRBV20-1*01 | TRBJ1-1*01 | CSAREQRDTMNTEAFF |
| 0.051957 | NaN | 545.0 | 98 | 98 | 0.010204 | 4361 | 0.10 | gex_cluster_vs_tcr_nbr | 652.0 | 2 | 12 | TRAV8-1*01 | TRAJ12*01 | CAVTPGADSSYKLIF | TRBV20-1*01 | TRBJ2-3*01 | CSALGVAGMGDGTQYF |
| 0.052461 | NaN | 54.0 | 17 | 16 | 0.000000 | 3463 | 0.01 | gex_cluster_vs_tcr_nbr | 493.0 | 5 | 5 | TRAV35*01 | TRAJ17*01 | CAGQLYKAAGNKLTF | TRBV19*01 | TRBJ2-3*01 | CASSQGGLGVHF |
| 0.054839 | NaN | 54.0 | 14 | 10 | 0.000000 | 3532 | 0.01 | gex_cluster_vs_tcr_nbr | 204.0 | 9 | 5 | TRAV35*01 | TRAJ42*01 | CAGKNYGGSQGNLIF | TRBV7-3*01 | TRBJ2-3*01 | CASSLRGDTQYF |
| 0.055734 | 54.0 | 54.0 | 7 | 6 | 0.000000 | 3533 | 0.01 | gex_nbr_vs_tcr_nbr | NaN | 9 | 5 | TRAV35*01 | TRAJ42*01 | CAGKNYGGSQGNLIF | TRBV7-3*01 | TRBJ1-2*01 | CASSPGPGSPYGYTF |
| 0.057226 | NaN | 54.0 | 14 | 10 | 0.000000 | 3623 | 0.01 | gex_cluster_vs_tcr_nbr | 205.0 | 9 | 5 | TRAV35*01 | TRAJ53*01 | CAGLNSGGSNYKLTF | TRBV6-4*01 | TRBJ1-2*01 | CASSARSGPLAGYTF |
| 0.062457 | 545.0 | NaN | 79 | 73 | 0.000000 | 2856 | 0.10 | gex_nbr_vs_tcr_cluster | 461.0 | 5 | 5 | TRAV27*01 | TRAJ17*01 | CAGAKAAGNKLTF | TRBV7-2*01 | TRBJ1-6*01 | CASSLRTGGDNSPLHF |
| 0.064224 | 545.0 | NaN | 105 | 104 | 0.000000 | 4586 | 0.10 | gex_nbr_vs_tcr_cluster | 708.0 | 0 | 1 | TRAV8-3*01 | TRAJ23*01 | CVIINQGGKLIF | TRBV24-1*01 | TRBJ1-2*01 | CATSKDRVYGYTF |
| 0.066483 | NaN | 54.0 | 13 | 10 | 0.000000 | 3538 | 0.01 | gex_cluster_vs_tcr_nbr | 203.0 | 9 | 5 | TRAV35*01 | TRAJ42*01 | CAGLNYGGSQGNLIF | TRBV11-2*01 | TRBJ1-2*01 | CASSSRANGLNGYTF |
| 0.069488 | 54.0 | 54.0 | 6 | 6 | 0.000000 | 3623 | 0.01 | gex_nbr_vs_tcr_nbr | NaN | 9 | 5 | TRAV35*01 | TRAJ53*01 | CAGLNSGGSNYKLTF | TRBV6-4*01 | TRBJ1-2*01 | CASSARSGPLAGYTF |
| 0.076880 | NaN | 54.0 | 12 | 10 | 0.000000 | 3574 | 0.01 | gex_cluster_vs_tcr_nbr | 201.0 | 9 | 5 | TRAV35*01 | TRAJ42*01 | CAGQNYGGSQGNLIF | TRBV6-2*01 | TRBJ1-5*01 | CASSYSQGQPQHF |
| 0.078514 | NaN | 545.0 | 98 | 97 | 0.010204 | 2480 | 0.10 | gex_cluster_vs_tcr_nbr | 652.0 | 2 | 2 | TRAV25*01 | TRAJ38*01 | CAGDNAGNNRKLIW | TRBV20-1*01 | TRBJ1-5*01 | CSALNQGQYSNQPQHF |
| 0.096380 | 545.0 | NaN | 28 | 28 | 0.000000 | 4963 | 0.10 | gex_nbr_vs_tcr_cluster | 122.0 | 2 | 12 | TRAV8-6*01 | TRAJ11*01 | CAVSLGPSGYSTLTF | TRBV20-1*01 | TRBJ2-3*01 | CSAIDRGQGDTQYF |
| 0.138730 | NaN | 54.0 | 12 | 11 | 0.000000 | 3571 | 0.01 | gex_cluster_vs_tcr_nbr | 256.0 | 6 | 5 | TRAV35*01 | TRAJ42*01 | CAGQNYGGSQGNLIF | TRBV5-5*01 | TRBJ2-1*01 | CASSPRLAGSSYNEQFF |
| 0.138730 | NaN | 54.0 | 12 | 11 | 0.000000 | 3577 | 0.01 | gex_cluster_vs_tcr_nbr | 256.0 | 6 | 5 | TRAV35*01 | TRAJ42*01 | CAGQNYGGSQGNLIF | TRBV7-8*01 | TRBJ1-2*01 | CASSPRQGAINGYTF |
| 0.171267 | NaN | 54.0 | 11 | 11 | 0.000000 | 3547 | 0.01 | gex_cluster_vs_tcr_nbr | 256.0 | 6 | 5 | TRAV35*01 | TRAJ42*01 | CAGLNYGGSQGNLIF | TRBV6-1*01 | TRBJ1-2*01 | CASSGRQGALYGYTF |
| 0.174073 | 545.0 | 545.0 | 83 | 83 | 0.000000 | 4563 | 0.10 | gex_nbr_vs_tcr_nbr | NaN | 0 | 1 | TRAV8-3*01 | TRAJ15*01 | CAVGGNQAGTALIF | TRBV4-2*01 | TRBJ1-1*01 | CASSQKGARGTEAFF |
| 0.183396 | 545.0 | NaN | 76 | 75 | 0.000000 | 1102 | 0.10 | gex_nbr_vs_tcr_cluster | 482.0 | 2 | 2 | TRAV13-2*01 | TRAJ8*01 | CAENTGFQKLVF | TRBV20-1*01 | TRBJ1-5*01 | CSARIGQDQPQHF |
| 0.185694 | 545.0 | NaN | 103 | 102 | 0.000000 | 4944 | 0.10 | gex_nbr_vs_tcr_cluster | 708.0 | 0 | 1 | TRAV8-4*01 | TRAJ8*01 | CAVSDRLGTGFQKLVF | TRBV25-1*01 | TRBJ1-5*01 | CASSDGVSQPQHF |
| 0.213459 | 545.0 | NaN | 102 | 102 | 0.000000 | 5021 | 0.10 | gex_nbr_vs_tcr_cluster | 708.0 | 4 | 1 | TRAV8-6*01 | TRAJ32*01 | CAVTPMGGATNKLIF | TRBV5-5*01 | TRBJ1-5*01 | CASSPRDSRNQPQHF |
| 0.213459 | 545.0 | NaN | 102 | 102 | 0.000000 | 5116 | 0.10 | gex_nbr_vs_tcr_cluster | 708.0 | 10 | 1 | TRAV8-6*01 | TRAJ9*01 | CAVSGGTGGFKTIF | TRBV19*01 | TRBJ2-7*01 | CASRPTSGSLDEQYF |
| 0.213459 | 545.0 | NaN | 102 | 102 | 0.000000 | 5032 | 0.10 | gex_nbr_vs_tcr_cluster | 708.0 | 4 | 1 | TRAV8-6*01 | TRAJ37*01 | CAVLGTGSSNTGKLIF | TRBV6-2*01 | TRBJ2-7*01 | CASRQTLLGEQYF |
| 0.213459 | 545.0 | NaN | 102 | 102 | 0.000000 | 4868 | 0.10 | gex_nbr_vs_tcr_cluster | 708.0 | 4 | 1 | TRAV8-4*01 | TRAJ43*01 | CAVSAYNNNDMRF | TRBV6-2*01 | TRBJ2-7*01 | CASNGGGAGEDEQYF |
| 0.231893 | NaN | 545.0 | 96 | 95 | 0.000000 | 2869 | 0.10 | gex_cluster_vs_tcr_nbr | 652.0 | 2 | 2 | TRAV27*01 | TRAJ24*01 | CAGARTTDSWGKLQF | TRBV20-1*01 | TRBJ2-2*01 | CSASTSGNTGELFF |
| 0.245305 | NaN | 545.0 | 86 | 86 | 0.000000 | 1325 | 0.10 | gex_cluster_vs_tcr_nbr | 575.0 | 3 | 14 | TRAV16*01 | TRAJ52*01 | CALSGRGGGGTSYGKLTF | TRBV18*01 | TRBJ2-7*01 | CASSPPGTEVQYF |
| 0.260769 | NaN | 545.0 | 98 | 94 | 0.000000 | 3464 | 0.10 | gex_cluster_vs_tcr_nbr | 652.0 | 2 | 5 | TRAV35*01 | TRAJ17*01 | CAGQLYRAAGNKLTF | TRBV19*01 | TRBJ1-2*01 | CASSPAPGQGSIYGYTF |
| 0.267568 | 545.0 | NaN | 75 | 71 | 0.000000 | 3633 | 0.10 | gex_nbr_vs_tcr_cluster | 460.0 | 6 | 5 | TRAV35*01 | TRAJ53*01 | CAGRLSGGSNYKLTF | TRBV11-2*01 | TRBJ1-2*01 | CASSLTGNYGYTF |
| 0.270479 | 545.0 | NaN | 27 | 27 | 0.000000 | 4453 | 0.10 | gex_nbr_vs_tcr_cluster | 122.0 | 5 | 12 | TRAV8-1*01 | TRAJ50*01 | CAVNGKTSYDKVIF | TRBV20-1*01 | TRBJ1-2*01 | CSAPIGRGNYGYTF |
| 0.292462 | 545.0 | NaN | 118 | 118 | 0.000000 | 4224 | 0.10 | gex_nbr_vs_tcr_cluster | 852.0 | 8 | 0 | TRAV5*01 | TRAJ39*01 | CAESIHAGNMLTF | TRBV11-2*01 | TRBJ2-3*01 | CASSLERNAAGADTQYF |
| 0.292820 | NaN | 54.0 | 14 | 9 | 0.000000 | 3542 | 0.01 | gex_cluster_vs_tcr_nbr | 203.0 | 9 | 5 | TRAV35*01 | TRAJ42*01 | CAGLNYGGSQGNLIF | TRBV6-1*01 | TRBJ1-2*01 | CASITKDRGFGYTF |
| 0.314597 | NaN | 54.0 | 14 | 9 | 0.000000 | 3593 | 0.01 | gex_cluster_vs_tcr_nbr | 205.0 | 9 | 5 | TRAV35*01 | TRAJ42*01 | CAVMNYGGSQGNLIF | TRBV5-1*01 | TRBJ1-2*01 | CASSAGRGDGYTF |
| 0.314597 | NaN | 54.0 | 14 | 9 | 0.000000 | 3533 | 0.01 | gex_cluster_vs_tcr_nbr | 205.0 | 9 | 5 | TRAV35*01 | TRAJ42*01 | CAGKNYGGSQGNLIF | TRBV7-3*01 | TRBJ1-2*01 | CASSPGPGSPYGYTF |
| 0.314597 | NaN | 54.0 | 14 | 9 | 0.000000 | 3587 | 0.01 | gex_cluster_vs_tcr_nbr | 205.0 | 9 | 5 | TRAV35*01 | TRAJ42*01 | CAGRNYGGSQGNLIF | TRBV7-3*01 | TRBJ2-3*01 | CASSPRHGTDTQYF |
| 0.347869 | 545.0 | NaN | 77 | 70 | 0.000000 | 3390 | 0.10 | gex_nbr_vs_tcr_cluster | 461.0 | 9 | 5 | TRAV30*01 | TRAJ33*01 | CGTALSNYQLIW | TRBV9*01 | TRBJ2-1*01 | CASSLLDLRYNEQFF |
| 0.386918 | NaN | 54.0 | 13 | 9 | 0.000000 | 3550 | 0.01 | gex_cluster_vs_tcr_nbr | 205.0 | 9 | 5 | TRAV35*01 | TRAJ42*01 | CAGLNYGGSQGNLIF | TRBV14*01 | TRBJ2-5*01 | CASSKRQHSPAETQYF |
| 0.410000 | NaN | 54.0 | 12 | 9 | 0.000000 | 3572 | 0.01 | gex_cluster_vs_tcr_nbr | 201.0 | 9 | 5 | TRAV35*01 | TRAJ42*01 | CAGQNYGGSQGNLIF | TRBV6-1*01 | TRBJ1-2*01 | CASFRGGVNGYTF |
| 0.452737 | 545.0 | NaN | 75 | 70 | 0.000000 | 2855 | 0.10 | gex_nbr_vs_tcr_cluster | 461.0 | 5 | 5 | TRAV27*01 | TRAJ16*01 | CAGRFSDGQKLLF | TRBV5-1*01 | TRBJ2-3*01 | CASSPPGGSTDTQYF |
| 0.455230 | NaN | 545.0 | 139 | 138 | 0.000000 | 77 | 0.10 | gex_cluster_vs_tcr_nbr | 1041.0 | 0 | 3 | TRAV1-2*01 | TRAJ9*01 | CAVRETGGFKTIF | TRBV3-1*01 | TRBJ2-5*01 | CASSQASGGRETQYF |
| 0.466759 | 545.0 | NaN | 117 | 117 | 0.000000 | 1005 | 0.10 | gex_nbr_vs_tcr_cluster | 852.0 | 8 | 0 | TRAV13-2*01 | TRAJ3*01 | CAEKMRGSSASKIIF | TRBV5-5*01 | TRBJ2-3*01 | CASSGGGWADTQYF |
| 0.499536 | NaN | 54.0 | 11 | 9 | 0.000000 | 3575 | 0.01 | gex_cluster_vs_tcr_nbr | 201.0 | 9 | 5 | TRAV35*01 | TRAJ42*01 | CAGQNYGGSQGNLIF | TRBV6-6*01 | TRBJ1-2*01 | CASSKRGDYGYTF |
| 0.499536 | NaN | 54.0 | 11 | 9 | 0.000000 | 3573 | 0.01 | gex_cluster_vs_tcr_nbr | 201.0 | 9 | 5 | TRAV35*01 | TRAJ42*01 | CAGQNYGGSQGNLIF | TRBV6-2*01 | TRBJ1-2*01 | CASSPTRGALVGYTF |
| 0.499536 | NaN | 54.0 | 11 | 9 | 0.000000 | 3567 | 0.01 | gex_cluster_vs_tcr_nbr | 201.0 | 9 | 5 | TRAV35*01 | TRAJ42*01 | CAGQNYGGSQGNLIF | TRBV11-2*01 | TRBJ1-2*01 | CASSPSRGSLGGYTF |
| 0.506932 | 545.0 | 545.0 | 85 | 80 | 0.000000 | 3577 | 0.10 | gex_nbr_vs_tcr_nbr | NaN | 6 | 5 | TRAV35*01 | TRAJ42*01 | CAGQNYGGSQGNLIF | TRBV7-8*01 | TRBJ1-2*01 | CASSPRQGAINGYTF |
| 0.515636 | 545.0 | NaN | 74 | 70 | 0.000000 | 2440 | 0.10 | gex_nbr_vs_tcr_cluster | 461.0 | 5 | 5 | TRAV25*01 | TRAJ21*01 | CAATYNFNKFYF | TRBV6-1*01 | TRBJ2-1*01 | CASSLTREQFF |
| 0.569272 | NaN | 545.0 | 95 | 93 | 0.000000 | 3439 | 0.10 | gex_cluster_vs_tcr_nbr | 652.0 | 2 | 5 | TRAV35*01 | TRAJ13*01 | CAGQNSGGYQKVTF | TRBV27*01 | TRBJ1-5*01 | CASSLYGYRGFGQPQHF |
Omitted 39 lines
graph_vs_graph_logos
This figure summarizes the results of a CoNGA analysis that produces
CoNGA scores and clusters. At the top are six
2D UMAP projections of clonotypes in the dataset based on GEX similarity
(top left three panels) and TCR similarity (top right three panels),
colored from left to right by GEX cluster assignment;
CoNGA score; joint GEX:TCR cluster assignment for
clonotypes with significant CoNGA scores,
using a bicolored disk whose left half indicates GEX cluster and whose right
half indicates TCR cluster; TCR cluster; CoNGA score; and GEX:TCR cluster
assignments for CoNGA hits, as in the third panel.
Below are two rows of GEX landscape plots colored by (first row, left)
expression of selected marker genes, (second row, left) Z-score normalized and
GEX-neighborhood averaged expression of the same marker genes, and
(both rows, right) TCR sequence features (see CoNGA manuscript Table S3 for
TCR feature descriptions).
GEX and TCR sequence features of CoNGA hits in clusters with
5 or more hits are summarized by a series
of logo-style visualizations, from left to right:
differentially expressed genes (DEGs); TCR sequence logos showing the V and
J gene usage and CDR3 sequences for the TCR alpha and beta chains; biased
TCR sequence scores, with red indicating elevated scores and blue indicating
decreased scores relative to the rest of the dataset (see CoNGA manuscript
Table S3 for score definitions); GEX 'logos' for each cluster
consisting of a panel of marker genes shown with red disks colored by
mean expression and sized according to the fraction of cells expressing
the gene (gene names are given above).
DEG and TCR sequence logos are scaled
by the adjusted P value of the associations, with full logo height requiring
a top adjusted P value below 10^-6. DEGs with fold-change less than 2 are shown
in gray. Each cluster is indicated by a bicolored disk colored according to
GEX cluster (left half) and TCR cluster (right half). The two numbers above
each disk show the number of hits within the cluster (on the left) and
the total number of cells in those clonotypes (on the right). The dendrogram
at the left shows similarity relationships among the clusters based on
connections in the GEX and TCR neighbor graphs.
The choice of which marker genes to use for the GEX umap panels and for the
cluster GEX logos can be configured using run_conga.py command line flags
or arguments to the conga.plotting.make_logo_plots function.
Image source: ./CoNGA.output_graph_vs_graph_logos.png